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1 Introduction 

In this lecture I shall present a class of mathematical tools for modeling phe- 
nomena that can be described in terms of elementary interacting objects. The 
goal is to make the macroscopic behavior arise from individual dynamics. I shall 
denote these individuals with the term automaton, in order to emphasize the 
main ingredients of the schematization: parallelism and locality. In my opinion, 
a good microscopic model is based on a rule that can be executed in parallel 
by several automata, each of which has information only on what happens in 
its vicinity (that can extend arbitrarily). In the following I shall use the word 
automaton either to refer to a single machine or to the whole set of machines 
sharing the same evolution rule. 

These are completely discrete models: the time increases by finite steps, 
the space is represented by a regular lattice, and also the possible states of the 
automaton (the space variables) can assume one out of a finite set of values. The 
reason for this choice is the conceptual simplicity of the description. A single real 
number requires an infinity of information to be completely specified, and since 
we do not have any automatic tool (i.e. computer) that efficiently manipulates 
real numbers, we have to resort to approximations. On the other hand, these 
discrete models can be exactly implemented on a computer, which is a rather 
simple object (being made by humans). Since the vast majority of present 
computers are serial ones (only one instruction can be executed at a time), the 
parallel nature of the model has to be simulated. 
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The goal of a simple microscopic description does not imply that one can- 
not use real numbers in the actual calculations, like for instance in mean field 
approximations . 

The class of phenomena that can be described with automata models is very 
large. There are real particle-like objects, such as atoms or molecules (from a 
classical point of view), that can be used to model the behavior of a gas. But 
one can build up models in which the automata represents bacteria in a culture 
or cells in a human body, or patches of ground in a forest. 

In reality there are two classes of automata, one in which the automata can 
wander in space, like molecules of a gas, and another one in which the automata 
are stick to the cell of a lattice. I shall call the first type molecular automata, 
and the second cellular automata. [] 

Probably the first type is the most intuitive one, since it resembles actual 
robots that sniff around and move. Each class has its own advantages, in term 
of simplicity of the description. A molecular automaton has information about 
its identity and its position. It can be used to model an animal, its state 
representing for instance the sex and the age. It is quite easy to write down a rule 
to make it respond to external stimuli. However, one runs into troubles when 
tries to associate a finite spatial dimension to this automaton. Let us suppose 
that the automaton occupies a cell on the lattice that represents the space. 
The evolution rule has to decide what happens if more than one automaton 
try to occupy the same cell. This is very hard to do with a true parallel, 
local dynamics, and can involve a negotiation which slows down the simulation. 
Clearly, a possible solution is to adopt a serial point of view: choose one of 
the automata and let it move. This is the approach of the computations based 
on molecular dynamics, where one tries to study the behavior of ensembles 
of objects following Newton's equations, or in Monte Carlo calculations. This 
serial approach is justified when time is continuous, and the discretization is 
just a computational tool, since for a finite number of objects the probability 
of having two moves at the same instant is vanishing, but indeed it is not a 
very elegant solution. Thus, molecular automata are good for point-like, non- 
excluding particles. 

The other solution is to identify the automata with a point in space: on each 
cell of the lattice there is a processor that can communicate with the neighboring 
ones. If we say that a state represents the empty cell and another the presence 
of a bacterium, we associate to it a well defined portion of space. From this 
point of view, the bacterium is nothing but a property of space. The usual name 
for this kind of models is cellular automata. 

People from physics will realize that cellular automata correspond to a field- 
like point of view, while molecular automata correspond to a particle point of 
view. In the last example above, only one particle can sit at a certain location 
(cell) at a certain time. Thus, we described a Fermion field. One can allow 

x See also the contribution by N. Boccara, this volume. 
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also an arbitrary number of particles to share the same position. This is a 
Boson field, in which particles loose their individuality. Following our analogy 
with elementary particles, we could say that molecular automata correspond to 
classical distinguishable particles. 

The Boson field represents also the link between molecular and cellular au- 
tomata. Indeed, if we relax the need of identifying each particle, and if we allow 
them to share their state (i.e. their identity and their position), then the two 
kinds of automata coincide. This suggests also an efficient way to simulate a 
molecular automata. Let us assume that every particle follows a probabilistic 
rule, i.e. it can choose among several possibilities with corresponding probabili- 
ties. Consider the case in which there are several identical particles at a certain 
position. Instead of computing the fate of each particle, we can calculate the 
number of identical particles that will follow a certain choice. If the number of 
identical particles at a certain position is large, this approach will speed up very 
much the simulation. 

In the following I shall concentrate on cellular automata. They have been 
introduced in the forties by John von Neumann Q, a mathematician that was 
also working on the very first computers. Indeed, cellular automata represent a 
paradigm for parallel computation, but von Neumann was rather studying the 
logical basis of life. The idea of the genetic code was just arising at that time, 
so we have to take into consideration the cultural period. 

From a mathematical point of view, he realized that the reproduction process 
implies that the organism has to include a description of itself, that we now 
know to be the genetic code. On the other hand, the chemistry of real world is 
too complex, and also the mechanics of a robot is not simply formalized. The 
solution was to drastically simplify the world, reducing it to a two-dimensional 
lattice. The result of these studies was a cellular automaton with the power of a 
general-purpose computer (a Turing machine) , and able to read the information 
to reproduce itself. This solution was quite complex (each cell could assume 
one out of 26 states), but now we have several simplified versions. One of 
these is not ably based on the Game of Life, a cellular automaton described in 
Section 4.1 For a review of these topics, see Sig mund (1993) [|. 



In spite of their mathematical interests, cellular automata has been quiescent 
for nearly 30 years, until the introduction of the John Conway's Game of Life in 
the columns of Scientific American ||, around 1970. The Game of Life is a two- 
dimensional cellular automaton in which the cells can assume only two states: 
for dead and 1 for live. Looking at the evolution of this cellular automaton an 
the display of a fast computer is quite astonishing. There are fermenting zones 
of space, and Life propagates itself by animal-like structures. From a prosaic 
point of view, there are interesting mathematical questions to be answered, as 



suggested in Section 4.1 



Finally, cellular automata exited the world of mathematicians around 1984, 
when the journal Physica dedicated a whole issue Q to this topic. A good review 
of the first application of cellular automata can also be found in Wolfram's 
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Figure 1: The k = 1 and k = 1/2 neighborhoods in d = 1. 



collection of articles. || 

In the following section I shall consider the mathematical framework of cel- 
lular automata, after which I shall review some applications of the concept. 

2 Cellular Automata 

Before going on we need some definitions. I shall denote the spatial index 
i and the temporal index t. Although we can have automata in a space of 
arbitrary dimension d, it is much simpler for the notations to consider a one- 
dimensional space. The state of a cell o\ at position i and at time t can assume a 
finite number of states. Again for simplicity I consider only Boolean automata: 
tj\ G {0, 1}. Assume N as the spatial dimension of the lattice. The state of the 
lattice can be read as a base-two number with N digits. Let me denote it with 
a* = (a\, . . . , a%). Clearly <x € {0, 2 N - 1}. 

The evolution rule <r t+1 = F(a t ) can in general be written in terms of a 
local rule 

a^ 1 ^f{al k ,...,al...,al +k ), (1) 

where k is the range of the interactions and the boundary conditions are usually 
periodic. The rule / is applied in parallel to all cells. The state of a cell 
at time t + 1 depends on the state of 2k + 1 cells at time t which constitute 
its neighborhood. We consider here the simplest neighborhoods: the k = 1 
neighborhood that in d = 1 consists of three cells and the k = 1/2 neighborhood, 
that in d = 1 consists of two cells and can be considered equivalent to the 
previous neighborhood without the central cell. A schematic diagram of these 
neighborhoods is reported in Fig. [I]. 

2.1 Deterministic Automata 

The Game of Life and von Neumann's automaton are deterministic ones, i.e. 
once given the initial state the fate is in principle known, even though it can 

2 In this contribution I shall try to keep notation constant. I shall denote vectors, matrices 
and vectorial operators by bold symbols, including some Boolean functions of an argument 
that can take only a finite number of values (like a Boolean string), when referred as a whole. 
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Table 1: Example of Wolfram's code for rule 22. 
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take a lot of effort. 

A compact way of specifying the evolution rule for k = 1, d = 1 cellular 
automata has been introduced by S. Wolfram ||. It consists in reading all 
possible configuration of the neighborhood {o\_i, cr*, of, x ) as a base- two number 
n, and summing up w n — 2™ multiplied by cr* +1 , as shown in Table [l] for the 
rule 22. 

This notation corresponds to the specification of the look-up table for the 
Boolean function that constitutes the evolution rule. For an efficient implemen- 
tation of a cellular automata one should exploit the fact that all the bits in a 
computer word are evaluated in parallel. This allows a certain degree of paral- 
lelism also on a serial computer. This approach is sometimes called multi-spin 
coding (see Section ||). In the following I shall use the symbols ©, A and V for 
the common Boolean operations exclusive OR (XDR), AND and OR. The negation 
of a Boolean variable will be indicated by a line over the variable. The AND 
operation will be denoted often as a multiplication (which has the same effect 
for Boolean variables). 

Let me introduce some terms that will be used in the following. If a Boolean 
function / of n variables a%, . . . ,a n is completely symmetric with respect to 
a permutation of the variables, than it depends only on the value of the sum 

di of these variables, and it is called totalistic. If the function is symmetric 
with respect to a permutation of the variables that correspond to the values of 
the cells in the neighborhood, but not to the previous value of the cell, than 
the function depends separately on the sum of the outer variables and on the 
previous value of the cells itself and the automaton is called outer totalistic. 

Deterministic cellular automata are discrete dynamical systems. Given a 
certain state <x at a certain time, the state at time t+ 1 is perfectly determined. 
The ordered collection of states (cr , . . . , er', . . . ) represents the trajectory of the 
system. Since for finite lattices the number of possible states is also finite, only 
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Figure 2: The return map of rule 22. 



limit cycles are possible. There can be several possibilities: only one cycle, 
several small cycles, one big and several small, etc, where big and small refer to 
the length of the cycle. Moreover, one can consider the basin of attraction of a 
cycle, i.e. the number of states that will eventually end on it. 

All these dynamical quantities will change with the size of the lattice. A 
physicist or a mathematician is generally interested to the asymptotic behavior 
of the model, but there may be a well defined size more interesting than the 
infinite-size limit. A table with the cycle properties of the simplest cellular 
automata can be found at the end of Wolfram's collection. Q 

The study of the limit cycles is clearly limited to very small lattices, es- 
pecially in dimension greater than one. To extend the investigation to larger 
lattices, one has to resort to statistical tools, like for instance the entropy of the 
substrings (patches) of a certain size. 

If a Boolean string a, of length n appears with probability p(a) (there are 
2 n such strings), then the normalized n-cntropy S n is defined as 



S n ranges from 1, if all strings appear with the same probability p = 1/2™, to 
if only one string appears. 

One is generally interested in the scaling of S n with n. For an example of 
the application of this method, see Grassberger's papers on rule 22. || 

If one reads the configuration er* = 01001001110101 ... as the decimal digit 
of the fractional part of a base-two number x(t) (dyadic representation), i.e. 




a 




x(t) = 0.01001001110101... 



(3) 
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Figure 3: The attractor of rule 22. 

for an infinite lattice one has a correspondence between the points in the unit 
interval and the configurations of the automaton. One has thus a complete cor- 
respondence between automata and maps x(t+ 1) = f(x(t)) of the unit interval. 
The map / is generally very structured (see Fig. ||) . This correspondence helps 
to introduce the tools used in the study of dynamical systems. 

Another possibility is the plot of the attractor and the measure of its fractal 
dimension. This can be performed dividing the configuration in two parts, and 
reading the left (right) part as a decimal number x(t) (y(t)). The portrait of 
the attractor for rule 22 is given in Fig. |[ 

One can try to extend to these dynamical systems the concept of chaotic 
trajectories. In the language of dynamical systems a good indicator of chaotic- 
ity is the positivity of the maximal Lyapunov exponent, which measures the 
dependence of the trajectory with respect to a small change in the initial posi- 
tion. This concept can be extended to cellular automata in two ways. The first 
solution is to exploit the correspondence between configurations and point in 
the unit interval. At the very end, this reduces to the study of the dynamical 
properties of the map /. This solution is not very elegant, since it looses the 
democratic point of view of the system (each cell has the same importance) . 

Another possibility is to look at the state of the system as a point in very high 
dimensional Boolean space. Here the smallest perturbation is a change in just 
one cell, and this damage can propagate at most linearly for locally interacting 
automata. However, one can measure the instantaneous spreading of the damage 
(i.e. the spreading in one time step), and from here it is possible to calculate 
a Lyapunov exponent |7| . The automata that exhibit disordered patterns have 
indeed a positive exponent for almost all starting points (trajectories), while 
simple automata can have Lyapunov exponent ranging from — oo to positive 
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Figure 4: Lyapunov exponent for k = 1 cellular automata. The symbol /i 
denotes the instantaneous diverging rate of trajectories, and the solid line the 
results of a random matrix approximation. For further details, see Bagnoli et 
al. (1992). § 

values. As for deterministic dynamical systems one interprets the trajectories 
with negative Lyapunov exponents as stable ones, and those with a positive 
exponent as unstable ones. In the simulation of an usual dynamical system, 
rounding effects on the state variables lead to the disappearance of the most 
unstable trajectories, so that the dependence of Lyapunov exponents on the 
trajectories looks quite smooth. We can recover this behavior with our discrete 
dynamical systems by adding a small amount of noise to the evolution. A plot 
of the maximal Lyapunov exponent for elementary cellular automata is reported 
in Fig. H 

The only source of randomness for deterministic cellular automata is in the 
initial configuration. The counterpart of chaoticity is the dependence on the 
initial condition. For chaotic automata a small change in the initial configuration 
propagates to all the lattice. Measures of this kind are called damage spreading 
or replica symmetry breaking. 

It is convenient to introduce the difference field (damage) h\ between two 
configurations x\ and y\ 

h\ = x\@y\ (4) 
and the Hamming distance H(t) = l/ n 2iLi ^ n Fig- ||the spre adin g of the 



difference from a single site for rule 22 is reported (see also Section 3.3). 

Finally, instead of studying the diverging rate of two trajectories, one can 
measure the strength required to make all trajectories coalesce. Once again, 
this is an indicator of the chaoticity of the automaton. 



8 



space 




Figure 5: The temporal plot of the damage spreading for rule 22. 
2.2 Probabilistic Automata 

For probabilistic cellular automata the look-up table is replaced by a table of 
transition probabilities that express the probability of obtaining a\ +1 = 1 once 
given the neighborhood configuration. For k = 1/2 we have four transition 
probabilities 

r(0,0-f 1) 
r(0,l->l) 
r(l,0->l) 
r(M-»l) 

with r(o, b ->• 0) = 1 - r(a, b->l). 

The evolution rule becomes a probabilistic Boolean function. This can be 
written as a deterministic Boolean function of some random bits, thus allowing 
the use of multi-site coding. In order to do that, one has to write the determin- 
istic Boolean functions that characterize the configurations in the neighborhood 
with the same probability. For instance, let us suppose that the k = 1/2 automa- 
ton of equation (||) has to be simulated with po = 0, p\ = P2 = p and p% = q. The 
Boolean function that gives 1 only for the configurations (cr|_ l5 = (0,1) 
or (1,0) is 

X\{l)=<rl-i®°i + i (6) 

and the function that characterizes the configuration (1,1) is 

x *(2)=a*_ 1 Aa* +1 . (7) 

Let me introduce the truth function. The expression \expression\ gives 1 if 
expression is true and zero otherwise (it is an extension of the delta function). 



Po] 
Pi; 

P2\ 
P3, 



(5) 
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Figure 6: The application of the transfer matrix for the k = 1/2, d = 1, iV = 2 
cellular automata. 

Given a neighborhood configuration o"| +1 }, <t- +1 can be 1 with probability 

Pi. This can be done by extracting a random number < r < 1 and computing 

i 

Although the sum is not a bitwise operation, it can safely used here since 
only one out of the Xi can be one for a given configuration of the neighborhood. 
The sum can be replaced with a OR or a XOR operation. 

Given a certain lattice configuration a = cr 1 at time t, the local transition 
probabilities allow us to compute the transition probability T^ a from configu- 
ration b to configuration a as 

N 

T ba = n T ( a i-i> a i+i -> b i)- ( 9 ) 
1=1 

The factorization of the matrix T implies that it can be written as a product 
of simpler transfer matrices M that add only one site to the configuration. 
Periodic boundary conditions require some attention: in Fig. ^ an example of 
decomposition T = LM N R for the simplest case N — 2 is shown. The lattice 
has been skewed for the ease of visualization. 

This transition matrix defines a Markov process. 

The state of the system at a certain time t is indicated by the probability 
la of observing the configuration a. Clearly, one has J2a x< a = 1- For a 
deterministic automata only one component of x is one, and all other are null. 
The time evolution of x is given by the applications of the transfer matrix T. 
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The conservation of probability implies that J2a ^ba = ^ ^ ^ s P oss ible to prove 
that the maximum eigenvalue Ai of T is one; the eigenvector corresponding to 
this eigenvalue is an asymptotic state of the system. 

If all configurations are connected by a chain of transition probabilities (i.e. 
there is always the possibility of going from a state to another) than the asymp- 
totic state is unique. 

The second eigenvalue A2 gives the correlation length £: 

e = -(lnA 2 )- 1 . (10) 

In our system two different correlation length can be defined: one in the 
space direction and one in the time direction. The temporal correlation length 
gives the characteristic time of convergence to the asymptotic state. 

The usual equilibrium spin models, like the Ising or the Potts model, are 
equivalent to a subclass of probabilistic cellular automata. In this case the 
transition probability between two configurations a and b is constrained by the 
detailed balance principle 

I^ = exp(f]H(b)-(3H(a)), (11) 
1 ab 

where H(a) is the energy of configuration a and (3 is the inverse of the tem- 
perature. One can invert the relation between equilibrium spin models and 
probabilistic cellular automata Q and reconstruct the Hamiltonian from the 
transition probabilities. In general, a cellular automata is equivalent to an equi- 
librium spin system if no transition probabilities are zero or one. Clearly in this 
case the space of configurations is connected. 

These systems can undergo a phase transition, in which some observable 
displays a non-analytic behavior in correspondence of a well defined value of a 
parameter. The phase transition can also be indicated by the non-ergodicity 
of one phase. In other words in the phase where the ergodicity is broken the 
asymptotic state is no more unique. An example is the ferromagnetic transition; 
in the ferromagnetic phase there are two ensembles of states that are not con- 
nected in the thermodynamic limit. In the language of the transfer matrix this 
implies that there are two (or more) degenerate eigenvalues equal to one. The 
corresponding eigenvectors can be chosen such that one has positive components 
(i.e. probabilities) for the states corresponding to one phase, and null compo- 
nents for the states corresponding to the other phase, and vice versa. Indeed, 
the degeneration of eigenvalues correspond to the divergence of the (spatial and 
temporal) correlation lengths. 

It is well known that equilibrium models cannot show phase transitions at 
a finite temperature (not zero nor infinite) in one dimension. However, this is 
no more true if some transition probabilities is zero or one violating the de- 
tailed balance Eq. (^). In effect, also the one dimensional Ising model exhibits a 
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Figure 7: Transfer Matrix M for N = 2. 



phase transition at a vanishing temperature, i.e. when some transition probabil- 
ities become deterministic. In particular one can study the phase transitions of 
models with adsorbing states, that are configurations corresponding to attract- 
ing points in the language of dynamical systems. An automata with adsorbing 
states is like a mixture of deterministic and probabilistic rules. A system with 
adsorbing states cannot be equivalent to an equilibrium model. 

Let me introduce here an explicit model in order to be more concrete: the 
Domany-Kinzel model ||. This model is defined on the lattice k = 1/2 and the 
transition probabilities are 



r(0,0- 


-1) 


- 0; 


r(0,l- 


-1) 


= p; 


r(l,0- 


-1) 


= p; 


r(l,l- 


-1) 


= q- 



(12) 



The configuration 0, in which all cells assume the value zero, is the adsorbing 
state. Looking at the single site transfer matrix M (reported in Fig. ^ for the 
simple case N — 2), one can see that (for p.q < 1) every configuration has a 
finite probability of going into the configuration 0, while one can never exit this 
state. In this simple probability space a state (i.e. a vector) v = (vq, . . . ,1)7) 
that corresponds to the single configuration a is given by va = S a b' ^ nc 
configuration corresponds to the state given by the vector u/ 1 ) = (1, 0, 0, . . . ). 

Let me indicate with the symbols Aj, i = 1, . . . , 8 the eigenvalues of M, with 
||Ai|| > HA2II > • • • > ||Ag||. The corresponding eigenvectors are . . . , w^ 8 \ 

Let us suppose that they form a base in this space. The Markovian character of 
M implies that the maximum eigenvalue is 1, corresponding to the eigenvector 

A generic vector v can be written as 

v = ai«j (1) + a 2 w (2) H (13) 

If we start from the vector v at time t = and we apply T times the transfer 
matrix T, in the limit of large N this is practically equivalent to the application 
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Figure 8: Modulus of the eigenvectors of the N — 2 transfer matrix for p = q. 



of the matrix M NT times, and we get 

v(T, N) = M NT v(0, N) = ai \? T wW + a 2 A^V 2 > + ... (14) 

and since all eigenvalues except the first one are in norm less that one, it follows 
that the asymptotic state is given by the vector ttiW (i.e. by the absorbing 
configuration 0). 

This situation can change in the limit N — > oo and (after) T — > oo (the 
thermodynamic limit) . In this limit some other eigenvalues can degenerate with 
Ai, and thus one or more configurations can survive forever. The existence 
of such phase transitions can be inferred from the plot of the modulus of the 
eigenvalues of M for N = 2. This is reported in Fig. || for the case p = q. In 
this finite lattice the degeneration with the eigenvalue Ai (which corresponds to 
the configuration in which all sites take value 1) occurs for p = 1. As discussed 



in Section 4.2 in the thermodynamic limit the transition occurs at p = 0.705 .... 
A rather different scenario exhibits for q = 0. In this case, as reported in Fig. ||, 
all eigenvalues degenerate with the first (for p = 0.81 ... in the thermodynamic 



limit) and this implies a different dynamical behavior (see Section 4.2) 



2.2.1 Critical Phenomena 

The already cited phase transitions are a very interesting subject of study by 
itself. We have seen that the correlation length £ diverges in the vicinity of a 
phase transition. The correlation between two sites of the lattice at distance 
r is supposed to behave as exp(— r/£), the divergence of £ implies very large 
correlations. Let us suppose that there is a parameter p that can be varied. In 
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Figure 9: Modulus of the eigenvectors of the N = 2 transfer matrix for q = 0. 

the vicinity of the critical value p c of this parameter the correlation length di- 
verges as £ <~ (p — Pc)~ v , where v is in general non integer. Also other quantities 
behaves algebraically near a critical point, like for instance the magnetization p 
which scales as p <~ (p — p c Y ■ 

This power-law behavior implies that there is no characteristic scale of the 
phenomena. Indeed, if we change the scale in which the parameter p is measured, 
the proportionality factor will change but the form of the law will not. The 
pictorial way of explaining this phenomena is the following: suppose we have 
some two-dimensional system near to a phase transition where one phase is 
white and the other is black. There will be patches and clusters of all sizes. 
If we look at this specimen by means of a TV camera, the finest details will 
be averaged on. Now let us increase the distance from the TV camera and the 
specimen (of infinite extension). Larger and larger details will be averaged out. 
If the system has a characteristic scale £, the picture will change qualitatively 
when the area monitored will be of order of the square of this length or more. 
On the other hand, if we are unable to deduce the distance of the TV camera 
from the surface by looking at the image, the system is self-similar and this is a 
sign of a critical phenomena. Another sign is the slow response to a stimulation: 
again, the distribution of the response times follows a power-law. 

The critical phenomena are very sensible to the parameter p, and a small 
change in it will destroy this self-similarity. In spite of their non-robustness, 
critical phenomena are heavily studied because their universality: since only 
large scale correlations are important, the details of the rule do not change 
the exponents of the power laws. This implies that the critical points of very 
different systems are very similar. 

Self-similar objects are very common in nature. One example is given by 
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Figure 10: The lattice for the diffusion in d = 1. 



clouds: it is almost impossible to evaluate the distance from a cloud, even when 
one is relatively near to it. And also power laws are found in many different 
fields, so are slow response times. It is impossible that we always meet usual 
critical phenomena, which are so delicate. It has been proposed jl0| that open 
systems (i.e. out of equilibrium system) can auto-organize into a self-organized 
critical state, a state that exhibits the characteristic of a critical point still being 
stable against perturbations. It is indeed possible to develop simple models that 
exhibit this featu re, but it is not clear how ubiquitous this quality is. The Game 
of Life of Section 4J is a very simple example of a Self Organized Critical (SOC) 
model. 



2.2.2 Diffusion 

The implementation of diffusion in the context of cellular automata is not very 
straightforward. One possibility is of course that of exchanging a certain num- 
ber of pairs of cells randomly chosen in the lattice. While this approach clearly 
mixes the cells, ^| it is a serial method that cannot be applied in parallel. A more 
sophisticated technique is that of dividing the lattice in pieces (a one dimen- 
sional lattice can be divided in couples of two cells, a two dimensional lattice 
in squares of four cells), and then rotate the cell in a patch according with a 
certain probability. This is equivalent to considering more complex lattices, but 
always with a local character, as shown in Fig. |l0| for d = 1. The diffusion can 
be controlled by a parameter D that gives the probability of rotating the block. 
One has also to establish how many times the procedure has to be repeated 



'For an application of this technique, refer to the contribution by Boccara, this volume. 
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(changing the blocks). A detailed analysis of the effects of this procedure can 
be found in Chopard and Droz (1989) Q. 



3 Numerical techniques 

In this section I shall review some techniques that I currently use to investigate 
cellular automata. 



3.1 Direct simulations 

The first approach is of course that of assigning each cell in the lattice to a 
computer word, generally of type integer, and to write the evolution rule using 
if . . . instructions. This generally produces very slow programs and waste a 
large amount of memory (i.e. the size of the lattices are limited). 

Another possibility is to use look-up tables. In this case one adds the value 
of the cells in the neighborhood multiplying them by an appropriate factor (or 
packs them into the bits of a single word), and then uses this number as an 
index in an array built before the simulation according with the rule. 

Alternatively, one can store the configurations in patches of 32 or 64 bits 
(the size of a computer word). This implies a certain degree of gymnastic to 
make the patches to fit together at boundaries. 

Again, since often one has to repeat the simulation starting from different 
initial configurations, one can work in parallel on 32 or 64 replicas of the same 
lattice, with the natural geometry of the lattice. When using probabilistic cel- 
lular automata (i.e. random bits), one can use the same input for all the bits, 
thus making this method ideal for the damage spreading investigations (see 



Section 3.3) 



Finally, one can exploit the natural parallelism of serial computers to perform 
in parallel simulations for the whole phase diagram, as described in Bagnoli et 
al. (1997). @ 

In order to use the last three methods, all manipulations of the bits have 
to be done using the bitwise operations AND, OR, XDR. It is possible to obtain a 
Boolean expression of a rule starting from the look-up table (canonical form), 
but this generally implies many operations. There are methods to reduce the 
length of the Boolean expressions. O, Il4) 



3.2 Mean Field 

The mean field approach is better described using probabilistic cellular au- 
tomata. Given a lattice of size L, its state is defined by the probability Xa of 
observing the configuration a. If the correlation length £ is less than L, two cell 
separated by a distance greater that £ are practically independent. The system 
acts like a collection of subsystems each of length £. Since £ is not known a 
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priori, one assumes a certain correlation length I and thus a certain system size 
L, and computes the quantity of interest. By comparing the values of these 
quantities with increasing L generally a clear scaling law appears, allowing to 
extrapolate the results to the case L — > oo. 

The very first step is to assume 1 = 1. In this case the I — 1 cluster 
probabilities are 7i"i(l) and 7Ti(0) = 1 — 7r(l) for the normalization. The I = 1 
clusters are obtained by I = 1 + 2k clusters via the transition probabilities. 

For the k = 1/2 Domany-Kinzel model we have 

<(1) = (7r 2 (Q > l)+7r2(l,0)p + 7r 2 (l,l)«. , s 

ni(0) = ^2(0,0) + (^2(0,l) + 7r 2 (l,0)(l-j5)+7r 2 (l,l)(l- g ), 1 0) 

where n = and n' — n( t+1 \ 

In order to close this hierarchy of equation, one factorizes the I = 2 prob- 
abilities. If we call p = 7Ti(l), we have 7r 2 (0, 1) = 7r 2 (l,0) = p(l — p) and 
7T 2 (1, 1) = p 2 . The resulting map for the density p is 

pt = 2pp+(q-2p)p 2 . (16) 

The fixed points of this map are p — and p — 2p/(2p — q). The stability of 
these points is studied by following the effect of a small perturbation. We have 
a change in stability (i.e. the phase transition) for p = 1/2 regardless of q. 

The mean field approach can be considered as a bridge between cellular 
automata and dynamical systems since it generally reduces a spatially extended 
discrete system to a set of coupled maps. 

There are two ways of extending the above approximation. The first is still to 
factorize the cluster probabilities at single site level but to consider more time 
steps, the second is to factorize the probabilities in larger clusters. The first 
approach applied for two time steps implies the factorization of 1^3(0, b, c) = 
7i"i (a)7Ti (6)7Ti (c) and the map is obtained by applying for two time steps the 
transition probabilities to the I = 3 clusters. The map is still expressed as 
a polynomial of the density p. The advantage of this method is that we still 
work with a scalar (the density), but in the vicinity of a phase transition the 
convergence towards the thermodynamic limit is very slow. 

The second approach, sometimes called local structure approximation |15|| , 
is a bit more complex. Let us start from the generic I cluster probabilities ~k\. 
We generate the I — 1 cluster probabilities 717 _i from 7r; by summing over one 
variable: 

7r/_i(ai, . . . , a;_i) = n(ai, ■ ■ ■ , a;_i,a/). (17) 

0,1 

The I + 1 cluster probabilities are generated by using the following formula 

7T/(oi, ...,a/)7r/(o2,...,a/+i) , . 

7r/+i(ai,a 2 , . . . ,ai,ai +1 ) = . (18) 

7R-i(a 2 , . . . , ai) 
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Finally, one is back to the I cluster probabilities by applying the transition 
probabilities 



This last approach has the disadvantage that the map lives in a high- 
dimensional (2') space, but the results converges much better in the whole phase 
diagram. 

This mean field technique can be considered an application of the transfer 
matrix concept to the calculation of the the eigenvector corresponding to the 
maximum eigenvalue (fundamental or ground state). 

3.3 Damage spreading and Hamming distance 

In continuous dynamical systems a very powerful indicator of chaoticity is the 
Lyapunov exponent. The naive definition of the (maximum) Lyapunov exponent 
is the diverging rate of two initially close trajectories, in the limit of vanishing 
initial distance. 

This definition cannot be simply extended to discrete systems, but we can 
define some quantities that have a relation with chaoticity in dynamical systems. 

First of all we need a notion of distance in discrete space. The natural 
definition is to count the fraction of corresponding cells that have different 
value (I consider here only Boolean cellular automata). If we indicate with x 
and y the two configurations, the distance h between them (called the Hamming 
distance) is 



It is possible to define an equivalent of a Lyapunov exponent (see Fig. [|), 
but the natural application of the Hamming distance is related to the damage 
spreading. 

For deterministic cellular automata the damage is represented by the Ham- 
ming distance between two configurations, generally starting from a small num- 
ber of damaged sites, and the goal is to classify the automata according with 
the average speed of the damage or other dynamical quantities. 

The extension to probabilistic cellular automata is the following: what will 
happen if we play again a rule with a different initial configuration but the same 
realization of the noise! If the asymptotic state changes, than the evolution 
remembers the initial configuration, otherwise it is completely determined by 
the noise. In the language of dynamical systems, this two scenarios correspond 
to a breaking of replica (the two configurations) symmetry. We can also define 
the difference field hi = a;, © yi with h representing its density. The breaking of 




(19) 



6i,...,6j +1 i=l 




(20) 
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replica symmetry thus corresponds to a phase transition for h from the adsorbing 
state h — to a chaotic state. 

4 Investigation Themes 

The theory described in the last section gives us the analytic tools and con- 
stitutes an interesting subject of study by itself. However, cellular automata 
can be studied as phenomenological models that mimic the real world from a 
mesoscopic point of view. The point of view of a physicist is again that of look- 
ing for the simplest model still able to reproduce the qualitative behavior of the 
original system. For this reason the models described in this section will be very 
crude; if one wants a model that mimics the system as good as possible, one can 
start from a simplified model and add all the features needed. An advantage 
of cellular automata with respect to system of differential or partial differen- 
tial equations is the stability of dynamics. Adding some feature or interactions 
never leads to structural instabilities. 

This section is of course not exhaustive. 

4.1 Life 

The Game of Life was introduced in the '70s by John Conway and then pop- 
ularized by Martin Gardner in the columns of Scientific American ||. It is a 
two dimensional, Boolean, outer totalistic, deterministic cellular automata that 
in some sense resembles the evolution of a bacterial population. The value 
is associated to an empty or dead cell, while the value 1 to a live cell. The 
automaton is defined on a square lattice and the neighborhood is formed by the 
nearest and next-to-nearest neighbors. The evolution rule is symmetric in the 
values of the cells in the outer neighborhood, i.e. it depends on their sum. The 
transition rules are stated in a pictorial way as 

• If a live cell is surrounded by less than two live cell, it will die by isolation. 

• If a live cell is surrounded by more that three live cells, it will die by 
overcrowding. 

• Otherwise, if a live cell is surrounded by two or three live cells, it will 
survive. 

• An empty cell surrounded by three live cells will become alive. 

• Otherwise an empty cell will stay empty. 

The evolution of this rule is impressive if followed on the screen of a fast 
computer (or a dedicated machine). Starting from a random configuration with 
half cell alive, there is initially a rapid drop in the density /?', followed by a long 
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Figure 11: The most common animals in Life. The figures represent the average 
abundance of each animal. 



phase of intense activity. Q After some hundreds time steps (for lattices of size 
order 200 x 200) there will emerge colonies of activity separated by nearly empty 
patches. In these patches there are small stable or oscillating configurations (the 
animals, see Fig. hi). Some of these configurations can propagate in the lattice 
(the gliders). The activity zones shrink very slowly, and sometimes a glider 
will inoculate the activity on a quiescent zone. Finally, after hundreds time 
steps, the configuration will settle in a short limit cycle, with density p ~ 0.028. 
The scaling of the relaxation time with the size of the lattice suggests that in 
an infinite lattice the activity will last forever (see Fig. |lj). The existence of 
a non-vanishing asymptotic density has been confirmed by recent simulations 
performed by Gibbs and Stauffer (1997) [|~7j on very large lattices (up to 4 x 10 10 
sites). 

There are several questions that have been addressed about the Game of 
Life. Most of them are mathematical, like the equivalence of Life to a universal 
computer, the existence of gardens of heaven, the feasibility of a self-reproducing 
structure in Life, and so on (see Sigmund (1993) ||). 

From a statistical point of view, the main question concerns the dependence 
of the asymptotic state from the density of the initial configuration (supposed 
uncorrelated) and the response time of the quiescent state. 

For the first question, in Fig. [l^ is reported the plot of the asymptotic density 
versus the initial density for a lattice square lattice with L = 256. One can see 
a transition of the value asymptotic density with respect to the initial density 
for a value of the latter around 0.05. In my opinion this effect is due to the 

4 An exhaustive analysis of the Game of Life can be found in Bagnoli et al (1992). 
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Figure 12: The plot of the asymptotic density p°° vs. the initial density p° in 
the Game of Life. The curves represents different lattice sizes: 96 x 50 (broken 
line), 160 x 100 (dashed line), 320 x 200 (full line). 

finite size of the lattice, since there is a slight dependence on the system size. 
However, recent simulations performed by Stauffer Jl8| on very large lattices 
still exhibit this effect. 

The response time can be measured by letting Life to relax to the asymptotic 
state, and then perturbing it in one cell or adding a glider. Recording the 
time that it takes before relaxing again to a periodic orbit and plotting its 
distribution, Bak et al. (1989) |nj found a power law, characteristic of self 
organized critical phenomena. One can investigate also the dynamical properties 
of the Game of Life, for instance the spreading of damage. Its distribution will 
probably follow a power law. 

Another possibility is to investigate the asymptotic behavior using mean 
field techniques. The simplest approximations are unable to give even a rough 
approximation of the asymptotic density p ~ 0.028, so it is worth to try more 
sophisticated approximations as the local structure one. 

4.2 Epidemics, Forest Fires, Percolation 

We can call this class of processes contact processes. They are generally repre- 
sentable as probabilistic cellular automata. Let me give a description in terms 
of an epidemic problem (non lethal) 

In this models a cell of the lattice represents an individual (no empty cells), 
and it can stay in one of three states: healthy and susceptible (0), ill and 
infective (1), or immune (2). 
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Figure 13: The mean field approximations for the Game of Life. 

Let us discuss the case without immunization. This simple model can be 
studied also in one spatial dimension (a line of individual) + time. 

Clearly the susceptible state is adsorbing. If there are no ill individual, the 
population always stays in the state 0. We have to define a range of interactions. 
Let us start with k = 1, i.e. the state of one individual will depend on its 
previous state and on that of its nearest neighbors. There is an obvious left- 
right symmetry, so that the automata is outer totalistic. The probability of 
having er* +1 = 1 will depend on two variables: the previous state of the cell erf 
and the sum of the states of the neighboring cells. Let me use the totalistic 
characteristic functions xlU) that take the value one if the sum of the variables 
in the neighborhood (here a\_ 1 and <x|, x ) is j and zero otherwise (see Eq. (g, 0)). 
Moreover, I shall denote er| with a' and I shall neglect to indicate the spatial 
and temporal indices. Then 

^ = /(^,x(i)). (21) 

The function/ gives the probability that a' is ill (1) given its present state 
and the number of ill neighbors. If / does not depend on a (i.e. the probability 
of contracting the infection is the same of staying ill for a given number of ill 
neighbors) we have again the Domany-Kinzel model Eq. (|l^). The parameters 
of this model are here the probability of contracting the infection or staying ill 
when an individual is surrounded by one (p) or two (q) sick individuals. The 
phase diagram of the model is reported in Fig. [l5| 

As one can see, there are two phases, one in which the epidemics last forever 
(active phase) and one in which the asymptotic state is only formed by healthy 
individuals (quiescent phase). 
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Figure 14: The temporal behavior of the density in the Game of Life. 

A feature of this model is the presence of a region in the phase diagram 
with spreading of damage (active phase). This zone corresponds to p q, i.e. 
it is due to an interference effect between the neighbors. ^| This is obviously 
not realistic for a generic infection, but it can have some application to the 
spreading of a political idea. 

One can also study the case in which if an individual is surrounded by a 
totality of ill individuals, it will certainly contract the infection (q = 1). This 
means that also the state in which all the cells have value one is adsorbing. In 
this case the transition is sharp (a first order transition). 

The phenomenology enriches if we allow the previous value of the cell a to 
enter the evolution function. Let us consider the a simplest case: a totalistic 
function of the three cells. Let us call p, q and w the probability of having 
a 1 = 1 if x(l); x(2) or x(3) is one, respectively. If we set w — 1 we have two 
adsorbing states and two parameters, so that the phase diagram is again two 
dimensional and easy to visualize (see Fig. |l6| ). This model will be studied in 
detail in the contribution by Bagnoli, Boccara and Palmerini, this volume. 

We see that in this case we can have both first and second order phase 
transitions. This observation could have some effective importance, in that a 
first order phase transition exhibit hysteresis, so that if the system enters one 
adsorbing state, it would cost a large change in the parameter to have it switch 
to the other state. On the other hand, a second order phase transition is a 
continuous one, with no hysteresis. 

The model can be extended to larger neighborhoods, immune states, more 
dimensions and diffusion, always looking to the phase diagram and to damage 

5 See also Bagnoli (1996). j2cj] 
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Figure 15: The phase diagram of the density (left) and damage (right) for the 
Domany-Kinzel model. (0 < p < 1, < g < 1, iV = 1000, T = 1000). The gray 
intensity is proportional to the value of the density, ranging from white to black 
in equally spaced intervals. 



spreading. The mathematical tools that can be used are direct simulations (with 
the fragment technique) and various mean field approximations (once more, the 
local structure technique is quite powerful). 

4.3 Ecosystems 

I shall use the following definition of an ecosystem: an evolving set of inter- 
acting individuals exposed to Darwinian selection. We start from the simplest 
model: let us consider an early ecosystem, composed by haploid individuals 
(like bacteria) . Each individual can sit on a cell of a lattice in d space dimen- 
sions. Each individuals is identified by its genetic information, represented as 
an integer number x. The genotype can be read as a four symbol (bases) or 
codon string. One could also consider the genome composed by alleles of a set 
of genes. I shall use here a binary coding (say, two alleles), because it is simpler 
to describe. All individuals have a genome of the same length I. Thus, we can 
interpret i as a base-two number representing the genotype or as an abstract 
index. The difference enters when we consider the mutations and thus we have 
to introduce the concept of distance in the genetic space. ^] In the first case 
the genotype space is an hypercube with 2 l dimensions, in the second it is an 
abstract space of arbitrary dimension, that for simplicity we can consider one 
dimensional. Finally, we have to introduce the phcnotipic distance, which is the 
difference in phenotipic traits between two individuals. Given a genotype x, the 
phcnotype is represented as g{x). In order to simplify the notations, I shall dc- 

6 In effects, the distance in the genetic space is defined in terms of the number of mutations 
needed to connect (along the shortest path) two individuals (or, more loosely, two species). 
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Figure 16: The phase diagram for the k = 1 totalistic cellular automaton with 
two adsorbing states. 

note a function of the phenotype (say h(g(x)) as h[x]. The phenotypic distance 
will be denoted as S(g(x), g(y)) — 5[x, y]. I do not consider the influence of age, 
i.e. a genotype univocally determines the phenotype. There could be more than 
one genotype that give origin to the same phenotype (polymorphism). 

This automaton has a large number of states, one for each different genome 
plus a state (*) for representing the empty cell. Thus * represents an empty 
cell, is the genome (0,0,..., 0), 1 is the genome (0, . . . , 0, 1) and so on. The 
evolution of the automaton is given by the application of three rules: first 
of all the interactions among neighbors are considered, giving the probability 
of surviving, then we perform the diffusion step and finally the reproduction 
phase. I shall describe the one (spatial) dimensional system, but it can be 
easily generalized. 

survival: An individual at site i in the state x\ ^ * and surrounded by 2k + 1 
(including itself) neighbors in the states {xi^k, ■ ■ ■ , ^i+fc} = {xi}k has a 
probability n(xi, {xi}k) of surviving per unit of time. This probability is 
determined by two factors: a fixed term h[xt] that represents its ability of 
surviving in isolation, and an interaction term l/(2k+l)^^i i _ fe J[xi, Xj\. 
Chearly, both the static fitness and the interaction term depend on the 
phenotype of the individual. 

The fixed field h and the interaction matrix J define the chemistry of 
the world and are fixed (at least in the first version of the model). The 
idea is that the species x with h[x] > represent autonomous individu- 
als that can survive in isolation (say, after an inoculation into an empty 
substrate), while those with h[x] < represents predators or parasites 
that necessitate the presence of some other individuals to survive. The 
distinction between autonomous and non-autonomous species depends on 
the level of schematization. One could consider an ecosystem with plants 
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and animals, so that plants are autonomous and animals are not. Or one 
could consider the plants as a substrate (not influenced by animals) and 
then herbivores are autonomous and carnivores are not. The interaction 
matrix specifies the necessary inputs for non autonomous species. 

We can define the fitness H f\ of the species Xi in the environment {xi}k 
as 

j i+k 

H(xi, {xi} k ) = h[x. L ] + 2fc - - ^2 J [ x U x i\- ( 22 ) 

j=i-k 

The survival probability n(H) is given by some sigma-shaped function of 
the fitness, such as 

e pH 1 1 

<h) = Yf^m = 2 + 2 tanh (^)> ( 23 ) 

where (3 is a parameter that can be useful to modulate the effectiveness 
of selection. 

The survival phase is thus expressed as: 

• If Xi ^ * then 

x' i = Xi with probability ir(H(xi, fr)A \ 

x\ — * otherwise ^ ' 

• Else 

x' i =x i = * (25) 
diffusion: The diffusion is given by the application of the procedure described 



in Section 2.2.2. The disadvantage of this approach is that we cannot in- 
troduce nor an intelligent diffusion (like escaping from predators or chasing 
for preys), nor different diffusion rates for different species. An alterna- 
tive could be the introduction of diffusion on a pair basis: two cells can 
exchange their identity according with a given set of rules. ^| 

For the moment we can use the following rule: 

• Divide the lattice in neighboring pairs (in d = 1 and no dependence 
on the neighbors there are two ways of doing that) 



7 There are several definition of fitness, the one given here can be connected to the growth 
rate A of a genetically pure population by A = exp(H); see also Section [4.4] 

8 There are several ways of dividing a lattice into couples of neighboring cells. If one want 
to update them in parallel, and if the updating rule depends on the state of neighboring cells, 
one cannot update a couple at the same time of a cell in the neighborhood. Since one has to 
update a sublattice after another, there is a slightly asymmetry that vanishes in the limit of 
a small diffusion probability iterated over several sublatticcs. 
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• Exchange the values of the cells with probability D 

Clearly the probability D could depend on the value of the cells and on 
that of neighbors. 

reproduction: The reproduction phase can be implemented as a rule for empty 
cells: they choose one of the neighbors at random and copy its identity 
with some errors, determined by a mutational probability u>. 

Thus 

• If the cell has value * then 

— choose one of the neighbors; 

— copy its state; 

— for each bits in the genome replace the bit with its opposite with 
probability cj; 

• Else do nothing. 

We have now to describe the structure of the interacting matrix J. I shall 
deal with a very smooth correspondence between phenotype and genotype: two 
similar genotypes have also similar phenotype. With this assumptions x repre- 
sents a strain rather than a species. This hypothesis implies the smoothness of 



First of all I introduce the intraspecies competition. Since the individuals 
with similar genomes are the ones that share the largest quantity of resources, 
then the competition is stronger the nearer the genotypes. This implies that 
J[x,x] < 0. 

The rest of J is arbitrary. For a classification in terms of usual ecological 
interrelations, one has to consider together J[x,y] and J[y, x]. One can have 
four cases: 



One has two choices for the geometry of the genetic space. 

hypercubic distance: The genetic distance between x and y is given by the 
number of bits that arc different, i.e. d(x, y) = \ \x ffi where the norm 
of a Boolean vector x is defined as ||x|| = (1/N) Yli=i bi- 
linear distance: The genetic space is arranged on a line (excluding the and 
with periodic boundary conditions) rather than on a hypercube. This 
arrangement is simpler but less biological; ^| 

9 An instance of a similar (sub-)space in real organisms is given by a repeated gene (say a 



J. 



J[x,y] < 

J[x, y] > 

J[x, y] < 

J[x,y] > 



J[y,x] < 
J[y,x] < 
J[y,x] > 
J[y,x] > 



competition 
predation or parasitism 
predation or parasitism 
cooperation 
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4.4 Mean field approximation 

Since the model is rather complex, it is better to startj^] with the mean field 
approximation, disregarding the spatial structure. This approximation becomes 
exact in the limit of large diffusion or when the coupling extends on all the 
lattice. 

Let n(x) be the number of organisms with genetic code x, and n* the number 
of empty sites, if N is the total number of cells, we have 

n* + n(x) = N; 

X 

m = — V^nfa;) = 1 -; 

X 

considering the sums extended to all " not-empty" genetic codes, and indicating 
with m the fraction of non-empty sites (i.e. the population size). 

We shall label with a tilde the quantities after the survival step, with a prime 
after the reproduction step. The evolution of the system will be ruled by the 
following equations: 



h(x) — ir(x,ri)n(x); (26) 
n'(x) = n(x) + j^^W{x,y)h(y). 



The matrix W(x, y) is the probability of mutating from genotype y to x. For a 
given codon mutation probability n, W(x,y) is given by 



hypercubic distance: 



W(x, y) = n d ^' v \l - fi) l - d ^; (27) 



linear distance: 



W(x,y) = n \i\x-y\ = l\ 

W(x,x) = 1-2/x; (28) 

W(x, y) = otherwise. 



tRNA gene): a fraction of its copies can mutate, linearly varying the fitness of the individual 
with the "chemical composition" of the gene. pi[ Th is degenerate case has been widely studied 
(see for instance Alves and Fontanari (1996) The linear space is equivalent to the 

hypercubic space if the phenotype g(x) does not depend on the disposition of bits in x (i.e. 
it depends only on the number of ones — a totalistic function) : one should introduce in this 
case the multiplicity of a degenerate state, which can be approximated to a Gaussian, but if 
one works in the neighborhood of its maximum (the most common chemical composition) the 
multiplicity factors are nearly constants. Another example is given by the level of catalytic 
activity of a protein. A linear space has also been used for modeling the evolution of RNA 
viruses on HeLa cultures. [123] 

10 We also stop in this lecture with this approximation. 

11 Since the original model is still in development, I shall not study the exact mean field 
version of the above description, but a simplified one. 
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In any case the mutations represent a diffusion propcess in genie space, thus 
the matrix W conserves the total population, and 

J2w(x,y) = J2w(x,y) = l. (29) 

x y 

Summing over the genomes x, we can rewrite Eq. ( p6|) for m as 

™ = T7 h ^ = Tr n ( x > n ) n ( x ) 



N ^— ' N 

X 

±Y / n'(x)=m+^ 2 



rri = — ^n'(x) = m+ — W(x,y)n(y). 



XI) 



Introducing the probability distribution of occupied sites p(x) — jj^U Q2 X P( X ) = 



1), and the average fitness it as 

7f = — i— Y k{x, n)n(x) = Y tNpW, 



a: a; 

and thus 

_ n* _ 

m = TO7T; = 1 — 7717T. 

TV 

Using the property (^9|) we obtain 

m = rn7f(2 — m7r). (30) 
The stationary condition (to' = m) gives 

1 = 7f(2 — TO7f); 
2tF- 1 



The normalized evolution equation for p(x) is thus 

7r(a;,p,m)p(a;) + (l-rn7f)X; i ,M / (a;,y)7r(a;,P,™)j3(y) 

fW= =7^ b: ; 31 

7r(z — rmr) 

where the usual fitness function, Eq. (p2|) , has to be written in term of p{x) 

H(x,p,m) = /i[i]+m^J[i,!/]p(!/) 

v 

Notice that Eq. (pOj) corresponds to the usual logistic equation for population 
dynamics if we keep the average fitness W constant. 
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4.5 Speciation due to the competition among strains 



One of the feature of the model is the appearance of the species intended as a 
cluster of strains connected by mutations. Q 

One can consider the following analogy with a Turing mechanism for chem- 
ical pattern formation. The main ingredients are an autocatalytic reaction pro- 
cess (reproduction) with slow diffusion (mutations) coupled with the emission 
of a short-lived, fast-diffusing inhibitor (competition). In this way a local high 
concentration of autocatalytic reactants inhibits the growth in its neighborhood, 
acting as a local negative interaction. 

In genetic space, the local coupling is given by the competition among ge- 
netically kin individuals. For instance, assuming a certain distribution of some 
resource (such as some essential metabolic component for a bacterial popu- 
lation), then the more genetically similar two individuals are, the wider the 
fraction of shared resources is. The effects of competition on strain x by strain 
y are modeled by a term proportional to the relative abundance of the latter, 
p(y), modulated by a function that decreases with the phenotypic distance be- 
tween x and y. Another example of this kind of competition can be found in the 
immune response in mammals. Since the immune response has a certain degree 
of specificity, a viral strain x can suffer from the response triggered by strain 
y if they are sufficiently near in an appropriate phenotypic subspace. Again, 
one can think that this effective competition can be modeled by a term, pro- 
portional to the relative abundance of the strain that originated the response, 
which decreases with the phenotypic distance. 

Let us start with a one dimensional "chemical" model of cells that reproduce 
asexually and slowly diffuse (in real space), p — p(x,t) being their relative 
abundance at position x and at time t. These cells constitutively emit a short- 
lived, fast-diffusing mitosys inhibitor q — q(x,t). This inhibitor may be simply 
identified with some waste or with the consumption of a local resource (say 
oxygen) . The diffusion of the inhibitor is modeled as 



where k$, k\ and D are the production, annihilation and diffusion rates of q. 
The evolution of the distribution p is given by 



m =kQP + D d^- kiq > 




(32) 



dp 

~dt 



(A(x,t)-A(t))p + n 



d 2 p 
dx 2 ' 



(33) 




(34) 



The growth rate A can be expressed in terms of the fitness H as 



A(x,t) = exp(H(x,t)) . 



(35) 
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Due to the form of equation (63), at each time step we have 



X>(M) = 1- (36) 

X 

The diffusion rate of q, D, is assumed to be much larger than \x. The growth 
rate A, can be decomposed in two factors, A(x,t) = Aq(x)Ai (q(x, i)), where 
Aq gives the reproductive rate in absence of q, so ^4i(0) = 1. In presence of a 
large concentration of the inhibitor q the reproduction stops, so Ai(oo) = 0. A 
possible choice is 

A(x, t) = exp(H (x) - q(x, £)). 

For instance, Hq(x) could model the sources of food or, for algae culture, the 
distribution of light. 

Since we assumed a strong separation in time scales, we look for a stationary 
distribution q(x,t) of the inhibitor (Eq. (j§2|)) by keeping p fixed. This is given 
by a convolution of the distribution p: 



q(x,t) = J J exp y- — P{y,t)dy, 



where J and R depend on the parameters kg, ki, D. In the following we shall 
use J and R as control parameters, disregarding their origin. 

We can generalize this scenario to non-linear diffusion processes of the in- 
hibitor by using the reaction-diffusion equation Eq. (|33|), with the fitness H and 
the kernel K given by 

H{x, t) = H (x)-J J K (^jp) P(y, t)dy (37) 



^(r)=exp^-^) ) (38) 

i.e. a symmetric decreasing function of r with K(Q) = 1. The parameters J and 
a control the intensity of the competition and the steepness of the interaction, 
respectively. 

Let us consider the correspondence with the genetic space: the quantity 
x now identifies a genome, the diffusion rate /x is given by mutations, and the 
inhibitor q (which is no more a real substance) represents the competition among 
phenotypically related strains. The effects of competition are much faster than 
the genetic drift (mutations), so that the previous hypotheses are valid. While 
the competition interaction kernel K(r) is not given by a diffusion process, its 
general form should be similar to that of Eq. ( |38| ) : a decreasing function of the 
phenotypic distance between two strains. We shall refer to the p-independent 
contribution to the fitness, Hq[ the static fitness landscape. 
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Figure 17: Static fitness Hq, effective fitness H, and asymptotic distribution p 
numerically computed for the following values of parameters: a = 2, fi = 0.01, 
H Q = 1.0, b = 0.04, J = 0.6, R = 10 and r = 3. 



Our model is thus defined by Eqs. (|33| |38|) . We are interested in its asymp- 
totic behavior in the limit fj, — > 0. Actually, the mutation mechanism is needed 
only to define the genetic distance and to allow population of an eventual niche. 
The results should not change qualitatively if one includes more realistic muta- 
tion mechanisms. 

Let us first examine the behavior of Eq. ( p3[ ) in absence of competition 
(J = 0) for a smooth static landscape and a vanishing mutation rate. This 
corresponds to the Eigen model in one dimension: since it does not exhibit 
any phase transition, the asymptotic distribution is unique. The asymptotic 
distribution is given by one delta function peaked around the global maximum of 
the static landscape, or more delta functions (coexistence) if the global maxima 
are degenerate. The effect of a small mutation rate is simply that of broadening 
the distribution from a delta peak to a bell-shaped curve J2j| . 

While the degeneracy of maxima of the static fitness landscape is a very par- 
ticular condition, we shall show in the following that in presence of competition 
this is a generic case. For illustration, we report in Fig. [l?] the numerical compu- 
tation of the asymptotic behavior of the model for a possible evolutive scenario 
that leads to the coexistence of three species. We have chosen a smooth static 
fitness Hq (see Eq. (^9[)) and a Gaussian (a = 2) competition kernel, using the 
genetic distance in place of the phenotypic one. The effective fitness H is almost 
degenerate (here /x > and the competition effect extends on the neighborhood 
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of the maxima), and this leads to the coexistence. 

In the following we shall assume again that the phenotypic space is the same 
of the genotypic space. 



4.5.1 Evolution near a maximum 

We need the expression of p if a given static fitness H(x) or a static growth rate 
A(x) has a smooth, isolated maximum for x = (smooth maximum approxima- 
tion). Let us assume that 

A(x) ~ A {l~ax 2 ), (39) 

where A a = A(0). 

The generic evolution equation (master equation) for the probability distri- 
bution is 



a(t)p(x,t + l) = fi + M^a) A(x,p(t))p(x,t); 



(40) 



where the discrete second derivative S 2 /5x 2 is defined as 



Sx 2 



f(x + l) + f(x-l)-2f(x), 



and a(t) maintains the normalization of pit). In the following we shall mix 
freely the continuous and discrete formulations of the problem. 

Summing over x in Eq. ( |40| ) and using the normalization condition, Eq. (|36|), 
we have: 

a = J2 A (x,p)p(x)=A. (41) 

X 

The normalization factor a thus corresponds to the average fitness. The quan- 
tities A and a are defined up to an arbitrary constant. 

If A is sufficiently smooth (including the dependence on p), one can rewrite 
Eq. (Eof) in the asymptotic limit, using a continuous approximation for x as 



d 2 

ap = Ap + fj,— (Ap), (42) 



Where we have neglected to indicate the genotype index x and the explicit de- 
pendence on p. Eq. (^) has the form of a nonlinear diffusion-reaction equation. 
Since we want to investigate the phenomenon of species formation, we look for an 
asymptotic distribution p formed by a superposition of several non-overlapping 
bell-shaped curves, where the term non-overlapping means almost uncoupled 
by mutations. Let us number these curves using the index i, and denote each 
of them as Pi(x), with p(x) = J2iPi( x )- Each pi(x) is centered around Xi and 
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its weight is J pi(x)dx — ji, with Ylili = 1- We further assume that each 
Pi(x) obeys the same asymptotic condition, Eq. (this is a sufficient but not 
necessary condition). Defining 

Ai = — [ A(x)pi(x)dx = a, (43) 
1% J 

we see that in a stable ecosystem all quasi-species have the same average fitness. 

Substituting q = Ap in Eq. (|||) we have (neglecting to indicate the genotype 
index x, and using primes to denote differentiation with respect to it): 

a .. 
-rq = q + pq ■ 



Looking for q — exp(w), 



and approximating A 1 = Aq 1 (l + ax 2 ), we have 



A possible solution is 



2(7 

Substituting into Eq. (E3) we finally get 



-^-(1 + ax 2 ) = 1 + n(w' 2 +w"). (44) 
Aq 



a 2 + afi — y 4a p, + a 2 /j, 2 
~A~o = 2 • 



(45) 



Since a = A, a/Ao is less than one we have chosen the minus sign. In the limit 
a/i — * (small mutation rate and smooth maximum), we have 



A 
and 



a 

a/i 



a 2 * Jt. (46) 
V a 



The asymptotic solution is 



, f + ax 2 
p(x) = 7 , exp . ... 
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so that / p{x)dx = 7. The solution is a bell-shaped curve, its width a being 
determined by the combined effects of the curvature a of maximum and the 
mutation rate pL.. In the next section, we shall apply these results to a quasi- 
species i. In this case one should substitute p — > pt, 7 — > 7$ and x — > x — x;. 

For completeness, we study also the case of a sharp maximum, for which 
A(x) varies considerably with x. In this case the growth rate of less fit strains 
has a large contribution from the mutations of fittest strains, while the reverse 
flow is negligible, thus 

p{x - l)A(x - 1) > p{x)A{x) > p(x + l)A(x + 1) 

neglecting last term, and substituting q(x) = A{x)p(x) in Eq. ( |Io| ) we get: 

1-2/i forx = (47) 



.4 





q ^ = ( 17 /uo - 1} for x > (48) 

(aA(xj — 1 + 2/iJ 

Near x = 0, combining Eq. @, Eq. Qand Eq. @), we have 

= 71 — ~ X )' 

(1 — 2p)ax i 

In this approximation the solution is 

/' 



q(x) = 
and 



1 - 2/ia J (x!) 2 ' 



y(x) = A(x)g(x) ~ + ax 2 ) f * 

^4o \ aa J xr 

One can check the validity of these approximations by solving numerically 
Eq. (40); the comparisons are shown in Fig. [l^. One can check that the smooth 



maximum approximation agrees with the numerics for for small values of a, when 
A(x) varies slowly with x, while the sharp maximum approximation agrees with 
the numerical results for large values of a, when small variations of x correspond 
to large variations of A{x). 



4.5.2 Speciation 

I shall now derive the conditions for the coexistence of multiple species. Let 
us assume that the asymptotic distribution is formed by L delta peaks pk, 
k = 0, ...,L — 1, for a vanishing mutation rate (or L non-overlapping bell 
shaped curves for a small mutation rate) centered at yt- The weight of each 
quasi species is 7^, i.e. 

. L-l 

I Pk(x)dx = 7 fc , ^2 7fc = 1. 
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Figure 18: Average fitness a/Ao versus the coefficient a, of the fitness function, 
Eq. (|39|), for some values of the mutation rate \x. Legend: numerical resolution 
corresponds to the numerical solution of Eq. (fl(]|), smooth maximum refers to 
Eq. ([45|) and sharp maximum to Eq. (47) 



The quasi-species are ordered such as 70 > 71, . . . , > 7l-i- 
The evolution equations for the p k are (/x — ► 0) 

^ = (A(y k )-A)p k , 
where A(x) = exp (H(x)) and 

L-l 

H{x) =H Q (x)-jJ2 K 

3=0 

The stability condition of the asymptotic distribution is (A(y k ) — A)p k = 0, 
i.e. either A(yk) = A = const (degeneracy of maxima) or pt = (all other 
points). In other terms one can say that in a stable environment the fitness of 
all individuals is the same, independently on the species. 

The position y k and the weight 7^ of the quasi-species are given by A(y k ) — 
A = const and dA(x)/dx\ yk = 0, or, in terms of the fitness H, by 

Ho(yk) -JJ2 K [ ^ p Vl ) V = const 



R 
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3=0 

Let us compute the phase boundary for coexistence of three species for two 
kinds of kernels: the exponential (diffusion) one (a = 1) and a Gaussian one 
(a = 2). 

I assume that the static fitness H (x) is a symmetric linear decreasing func- 
tion except in the vicinity of x = 0, where it has a quadratic maximum: 

H Q (x) = 6 fl - M _ 1 ) (49) 

so that close to x = one has H (x) ~ —bx 2 /r 2 and for a; — » oo, H n (x) ~ 
6(1 — |a;|/r). Numerical simulations show that the results are qualitatively in- 
dependent on the exact form of the static fitness, providing that it is a smooth 
decreasing function. 

Due to the symmetries of the problem, we have one quasi-species at x = 
and two symmetric quasi-species at x = ±y. Neglecting the mutual influence 
of the two marginal quasi-species, and considering that Hq(0) = K'(0) = 0, 
K'{y/R) = —K'(—y/r), K(0) = J and that the three-species threshold is given 
by 7o = 1 and 71 = 0, we have 

»(i - f) - 'm = -1. 



PS*) 7,-0 



\ + K\y) = 0. 
r 

where y = y/R, r = r/R and b = b/J. I introduce the parameter G = f/b = 
(J/R)/(b/r), that is the ratio of two quantities, one related to the strength of 
inter-species interactions (J/R) and the other to intra-species ones (b/r). In the 
following I shall drop the tildes for convenience. Thus 

r — z — Gcxp ( — — j = — G, 



Gz^cxp = 1. 

For a = lwe have the coexistence condition 

ln(G) = r - 1 + G. 

The only parameters that satisfy these equations arc G = 1 and r = 0, i.e. 

a flat landscape (b = 0) with infinite range interaction (R = 00). Since the 
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Figure 19: Three-species coexistence boundary G c for a — 2. The continuous 
line represents the analytical approximation, Eq. (|5(i|), the circles are obtained 
from numerical simulations. The error bars represent the maximum error. 

coexistence region reduces to a single point, it is suggested that a = 1 is a 
marginal case. Thus for less steep potentials, such as power law decrease, the 
coexistence condition is supposed not to be fulfilled. 
For a — 2 the coexistence condition is given by 

z 2 -{G + r)z + 1 = 0, 



Gzexp I --J = 1. 

One can solve numerically this system and obtain the boundary G c (r) for the 
coexistence. In the limit r — > (static fitness almost flat) one has 

G c (r) ~ G c (0) - r (50) 

with G c (0) = 2.216 .... Thus for G > G c (r) we have coexistence of three or 
more quasi-species, while for G < G e (r) only the fittest one survives. 

I have solved numerically Eqs. (|33|-|3^) for several different values of the 
parameter G, considering a discrete genetic space, with N points, and a simple 
Euler algorithm. The results, presented in Fig. 2, are not strongly affected by 
the integration step. The error bars are due to the discreteness of the changing 
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parameter G. The boundary of the multi-species phase is well approximated 
by Eq. (|30|); in particular, I have checked that this boundary does not depends 
on the mutation rate \x, at least for /i < 0.1, which can be considered a very 
high mutation rate for real organisms. The most important effect of /i is the 
broadening of quasi-species curves, which can eventually merge. 

4.6 Final considerations 

Possible measures on this abstract ecosystem model concern the origin of com- 
plexity (for instance considered equivalent to the entropy) in biological systems. 
One expects that steady ecosystems reach an asymptotic value of the complexity 
that maximizes the explotation of energy, while perturbed ecosystems exhibit 
lower complexity. Another interesting topic regards the behavior of complexity 
in relation with the spatial extension of a system. 
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